In this analysis, different portfolios consisting of stocks from the top 10 constituents of the MSCI Spain Index are backtested using the functionality of the portvine package. First the most important packages are loaded; afterwards the data is imported and briefly discussed.
# main workhorse for the estimation of risk measures
library(portvine)
# general data wrangling and visualizations
library(tidyverse)
# arrange ggplots nicely
library(patchwork)
# utility color vector for visualizations
custom_colors <- c("#92B8DE", "#db4f59", "#477042", "#cc72d6")
theme_set(
  theme_minimal() +
    theme(plot.title = ggtext::element_markdown(size = 11),
          plot.subtitle = ggtext::element_markdown(size = 9))
)
Sys.setlocale("LC_TIME", "English")
# load the data
load(here::here("data", "msci_spain_data_clean.RData"))
# source utils
source(here::here("analysis_utils.R"))
glimpse(msci_spain_complete_data)
## Rows: 1,695
## Columns: 13
## $ date <dttm> 2015-05-11, 2015-05-12, 2015-05-13, 2015-05-14, 2015~
## $ msci_spain_index <dbl> -0.0003077472, -0.0082391913, -0.0029852937, 0.004217~
## $ iberdrola <dbl> 0.0044134899, -0.0011447704, 0.0084790041, 0.00438339~
## $ banco_santander <dbl> -0.0013274410, -0.0148344281, -0.0035906090, 0.009393~
## $ inditex <dbl> -0.004312954, -0.004331636, -0.006969884, 0.012165600~
## $ cellnex_telecom <dbl> -0.016530307, 0.006643209, 0.026148592, 0.019166054, ~
## $ repsol_ypf <dbl> 0.0002758643, -0.0041296843, -0.0024876548, 0.0046928~
## $ ferrovial <dbl> 0.0065986178, -0.0099143271, 0.0106753099, 0.00580057~
## $ amadeus_it_group <dbl> 0.0080103947, 0.0107436023, -0.0065789711, -0.0036737~
## $ telefonica <dbl> 0.0057893368, -0.0405038229, -0.0003810002, -0.010201~
## $ bbv_argentaria <dbl> -0.0042779936, -0.0055121217, -0.0008861844, 0.013735~
## $ sp500 <dbl> 0.0029540152, 0.0003049362, -0.0107215489, -0.0007681~
## $ eurostoxx50 <dbl> -0.006893194, -0.014257898, -0.005523096, 0.000000000~
One can see that there are 13 columns. The date column obviously gives the date, as daily return data is analyzed here. The daily log returns of the overall index are given in the column msci_spain_index, eurostoxx50 contains the corresponding daily log returns of the Eurostoxx 50 index, sp500 those of the SP500 index, and all other columns contain the daily log returns of the respective stocks within the MSCI Spain.
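All return columns are daily log returns; as a small standalone sketch (with hypothetical prices, not values from the data set), such a series is derived from raw prices like this:

```r
# Toy illustration: daily log returns are the first differences of the
# log price series (prices here are hypothetical).
prices <- c(100, 101.2, 100.5, 102.0)
log_returns <- diff(log(prices))
round(log_returns, 6)
```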
summary(msci_spain_complete_data$date)
## Min. 1st Qu. Median
## "2015-05-11 00:00:00" "2016-12-22 12:00:00" "2018-08-08 00:00:00"
## Mean 3rd Qu. Max.
## "2018-08-08 09:41:05" "2020-03-24 12:00:00" "2021-11-08 00:00:00"
nrow(msci_spain_complete_data)
## [1] 1695
This means that the time frame ranges from the 11th of May 2015 until the 8th of November 2021 and comprises 1695 observations. Below one can have a look at the daily log returns of the overall MSCI Spain index.
One can observe greater volatility during the stock market selloffs of 2015-2016, possibly due to the Chinese stock market turbulence, the EU debt crisis and the Brexit vote, as well as during the first ‘Covid-19 year’ 2020. The period with the pandemic in place allows testing the backtesting approach in a higher-volatility setting. Thus the following two time frames of interest are specified.
The corresponding subsets of the data are stored as msci_spain_16_19 and msci_spain_20_21. As all risk estimations are based on the rather large sample size \(S\) of 100000, they were performed remotely at the Leibniz Supercomputing Centre and the results were saved. This allows exploiting the nested parallelization provided by the portvine package.
The portfolio considered here consists of 9 assets, weighted according to their respective market capitalization on the 29.10.2021.
# weights according to the market capitalization on the 29.10.21
weights_values <- c(
iberdrola = 60.48,
banco_santander = 56.82,
bbv_argentaria = 40.42,
inditex = 34.08,
cellnex_telecom = 27.09,
amadeus_it_group = 26.06,
telefonica = 19.37,
repsol_ypf = 16.04,
ferrovial = 12.99
)
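The figures above are market capitalizations (in billions), not normalized weights. As a self-contained sanity check (repeating the vector from above), the implied relative portfolio weights are:

```r
# Market capitalizations from above; the relative weights simply rescale
# them to sum to one.
weights_values <- c(
  iberdrola = 60.48, banco_santander = 56.82, bbv_argentaria = 40.42,
  inditex = 34.08, cellnex_telecom = 27.09, amadeus_it_group = 26.06,
  telefonica = 19.37, repsol_ypf = 16.04, ferrovial = 12.99
)
relative_weights <- weights_values / sum(weights_values)
round(relative_weights, 3)
```

So Iberdrola alone accounts for roughly a fifth of the portfolio.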
One always estimates all available risk measures, i.e. the VaR and the ES, the latter with all three estimation methods (mean, median and Monte Carlo integration based on 10000 samples). The estimated \(\alpha\) levels are 0.01, 0.025, 0.05 and 0.95.
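To make the mean- and median-based ES estimators concrete, here is a minimal base R sketch on simulated Gaussian returns (illustrative only, not the portvine internals; the Monte Carlo integration variant additionally resamples and is omitted): the VaR is the empirical \(\alpha\)-quantile of the simulated portfolio returns, and the ES estimates aggregate the simulated values below that VaR.

```r
# Minimal sketch of the VaR/ES estimators on simulated returns.
set.seed(42)
simulated_returns <- rnorm(100000, mean = 0, sd = 0.01)
alpha <- 0.05
var_est <- quantile(simulated_returns, probs = alpha, names = FALSE)
tail_sample <- simulated_returns[simulated_returns <= var_est]
es_mean_est <- mean(tail_sample)      # mean-based ES
es_median_est <- median(tail_sample)  # median-based ES
c(VaR = var_est, ES_mean = es_mean_est, ES_median = es_median_est)
```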
For the marginal models one uses as a default the commonly used ARMA(1,1)-GARCH(1,1) model. If a lack of fit is detected, this can be adjusted. The training window size for the marginal models is fixed to 750 days, roughly 3 financial years, in the first time frame and to 300 days in the second time frame. For both time frames the refitting frequency is 50 days, roughly 2 financial months.
In the unconditional case one allows for the general class of R-vine copulas and uses all available parametric bivariate building blocks. For the vine training size, different specifications will be evaluated: for the first time frame, vine training window lengths of 250 and 500 are tried, and for the second one, 100 and 200. For both time frames, the two refitting frequencies of 25 and 50 are tried.
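The window sizes above define a simple rolling scheme; the index bookkeeping they imply (a sketch inferred from the text, not package code) looks like this:

```r
# With train size 750 and refit size 50, the k-th marginal window trains on
# 750 consecutive observations, shifted forward by 50 per refit.
window_indices <- function(k, train_size = 750, refit_size = 50) {
  start <- (k - 1) * refit_size + 1
  start:(start + train_size - 1)
}
range(window_indices(1))  # observations 1 to 750
range(window_indices(2))  # observations 51 to 800
```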
# load the unconditional models
load(here::here("data", "msci_spain_uncond_res.RData"))
load(here::here("data", "msci_spain_uncond_gausst.RData"))
Display the summaries of two of the models.
summary(uncond_risk_roll_16_19_g50_k25_p250)
## An object of class <portvine_roll>
##
## --- Marginal models ---
## Number of ARMA-GARCH/ marginal windows: 6
## Train size: 750
## Refit size: 50
##
## --- Vine copula models ---
## Number of vine windows: 12
## Train size: 250
## Refit size: 25
## Vine copula type: rvine
## Vine family set: parametric
##
## --- Risk estimation ---
## Risk measures estimated: VaR ES_mean ES_mc ES_median
## Alpha levels used: 0.01 0.025 0.05 0.95
## Number of estimated risk measures: 4672
## Number of samples for each risk estimation: 1e+05
##
## Time taken: 6.7735 minutes.
summary(uncond_risk_roll_20_21_g50_k25_p200)
## An object of class <portvine_roll>
##
## --- Marginal models ---
## Number of ARMA-GARCH/ marginal windows: 2
## Train size: 300
## Refit size: 50
##
## --- Vine copula models ---
## Number of vine windows: 4
## Train size: 200
## Refit size: 25
## Vine copula type: rvine
## Vine family set: parametric
##
## --- Risk estimation ---
## Risk measures estimated: VaR ES_mean ES_mc ES_median
## Alpha levels used: 0.01 0.025 0.05 0.95
## Number of estimated risk measures: 1600
## Number of samples for each risk estimation: 1e+05
##
## Time taken: 5.9738 minutes.
As the marginal models are the same within each time frame, one has a look at the respective models, first for the time frame from 2016 up to 2019. For simplicity, only the model corresponding to the first marginal window is examined here. This might also be a reasonable approach in practice, as one then only uses the training data available before starting the rolling window risk estimation.
marg_viz_uncond_risk_roll_16_19 <- marg_resid_viz_list(
uncond_risk_roll_16_19_g50_k25_p250
)
marg_viz2_uncond_risk_roll_16_19 <- marg_resid_viz_list(
uncond_risk_roll_16_19_g50_k25_p250, squared = TRUE
)
for (asset in names(marg_viz_uncond_risk_roll_16_19)) {
print(marg_viz_uncond_risk_roll_16_19[[asset]])
print(marg_viz2_uncond_risk_roll_16_19[[asset]])
}
A more comprehensive yet less informative visualization is achieved via the heatmap below.
ljung_heatmap(uncond_risk_roll_16_19_g50_k25_p250)
The interactive version even allows to see how the marginal models of all the marginal windows behave w.r.t. the Ljung-Box test.
ljung_heatmap_animation(uncond_risk_roll_16_19_g50_k25_p250)
Thus it is quite obvious that the default specification of an ARMA(1,1)-GARCH(1,1) model does a good job. The only two questionable marginal models are those of Ferrovial and Cellnex. But as the quality breaches only appear beyond the third lag, one will for now hold on to the default model.
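The heatmaps summarize per-lag Ljung-Box p-values; as a standalone illustration of the underlying test (on white-noise stand-in residuals, using the base stats function Box.test):

```r
# White-noise residuals should show no remaining autocorrelation, i.e.
# large Ljung-Box p-values across the tested lags.
set.seed(1)
residuals_example <- rnorm(750)  # stand-in for standardized ARMA-GARCH residuals
ljung_pvals <- vapply(
  1:5,
  function(lag) Box.test(residuals_example, lag = lag, type = "Ljung-Box")$p.value,
  numeric(1)
)
round(ljung_pvals, 3)
```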
Next up is the second time frame from 02.04.2020 to 13.10.2021.
marg_viz_uncond_risk_roll_20_21 <- marg_resid_viz_list(
uncond_risk_roll_20_21_g50_k25_p200
)
marg_viz2_uncond_risk_roll_20_21 <- marg_resid_viz_list(
uncond_risk_roll_20_21_g50_k25_p200, squared = TRUE
)
for (asset in names(marg_viz_uncond_risk_roll_20_21)) {
print(marg_viz_uncond_risk_roll_20_21[[asset]])
print(marg_viz2_uncond_risk_roll_20_21[[asset]])
}
A more comprehensive yet less informative visualization is again achieved via the heatmap below.
ljung_heatmap(uncond_risk_roll_20_21_g50_k25_p200)
The interactive version again allows to see how the marginal models of all the marginal windows behave w.r.t. the Ljung-Box test.
ljung_heatmap_animation(uncond_risk_roll_20_21_g50_k25_p200)
Here the fit of Banco Santander is the most questionable, but as it is a close call one will first stick to the default model for all marginal models.
Next up one can analyze the fitted R-vines. The most interesting questions in this context are which bivariate building blocks were fitted and how strong their dependence is. It might also be interesting to detect changing patterns over time. Again one starts with the first time frame.
labeled_vinecop_plot(fitted_vines(uncond_risk_roll_16_19_g50_k25_p250)[[1]])
labeled_vinecop_plot(fitted_vines(uncond_risk_roll_16_19_g50_k25_p250)[[6]])
labeled_vinecop_plot(fitted_vines(uncond_risk_roll_16_19_g50_k25_p250)[[12]])
bicops_used(fitted_vines(uncond_risk_roll_16_19_g50_k25_p250)[[1]])
##
## bb7 clayton frank gaussian gumbel indep joe t
## 1 5 12 3 5 7 1 2
bicops_used(fitted_vines(uncond_risk_roll_16_19_g50_k25_p250)[[6]])
##
## bb8 clayton frank gaussian gumbel indep joe t
## 1 5 8 3 6 7 1 5
bicops_used(fitted_vines(uncond_risk_roll_16_19_g50_k25_p250)[[12]])
##
## bb8 clayton frank gaussian gumbel indep joe t
## 1 1 4 6 5 9 5 5
So a change over time is clearly visible, as well as the use of components beyond the Gaussian and Student's t families. The same analysis can be performed on the second time frame.
labeled_vinecop_plot(fitted_vines(uncond_risk_roll_20_21_g50_k25_p200)[[1]])
labeled_vinecop_plot(fitted_vines(uncond_risk_roll_20_21_g50_k25_p200)[[4]])
bicops_used(fitted_vines(uncond_risk_roll_20_21_g50_k25_p200)[[1]])
##
## bb1 bb7 bb8 clayton frank gaussian gumbel indep
## 3 2 1 4 4 1 7 8
## joe t
## 3 3
bicops_used(fitted_vines(uncond_risk_roll_20_21_g50_k25_p200)[[4]])
##
## bb1 bb8 frank gaussian gumbel indep joe t
## 2 2 5 5 5 9 3 5
It is obvious that in this high-volatility time frame the use of extreme value copulas is much higher than in the previous time frame. Again the Banco Santander asset seems to have a central role in this portfolio.
As the package provides three different risk measure estimators, namely mean, median and Monte Carlo integration estimation, it is reasonable to compare the estimated risk measures for the two time frames.
For the first time frame this looks as follows.
Also one might compare the number of exceedances.
risk_estimates(uncond_risk_roll_16_19_g50_k25_p250,
risk_measures = c("ES_mean", "ES_median", "ES_mc"),
alpha = 0.05,
exceeded = TRUE) %>%
group_by(risk_measure) %>%
summarise(relative_exceedances = mean(exceeded))
Thus the differences between the estimators are minimal. Actually, there is an indication that the values that fell below the corresponding VaR were left-skewed. In those cases the mean is the more conservative estimate, which might explain the slightly lower exceedance rate. One can also look at the same comparisons for the second time frame.
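As an aside to the skewness explanation above, a toy tail sample (hypothetical values, chosen only to be left-skewed) makes the mean-vs-median effect concrete:

```r
# If the realizations below the VaR are left-skewed, their mean lies below
# their median, so the mean-based ES is the more conservative estimate.
tail_sample <- c(-0.100, -0.035, -0.032, -0.031, -0.030)
mean(tail_sample) < median(tail_sample)  # TRUE for a left-skewed tail
```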
Also one might compare the number of exceedances.
risk_estimates(uncond_risk_roll_20_21_g50_k25_p200,
risk_measures = c("ES_mean", "ES_median", "ES_mc"),
alpha = 0.05,
exceeded = TRUE) %>%
group_by(risk_measure) %>%
summarise(relative_exceedances = mean(exceeded))
The same patterns as above are visible. Thus one will stick to the mean estimation approach for the rest of the analysis.
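Next the traditional backtests are considered. As a reminder of what the unconditional coverage test measures, here is a minimal standalone sketch of its Kupiec likelihood-ratio form (not the esback implementation; edge cases with zero exceedances would need special handling):

```r
# Under a correct VaR at level alpha, exceedances are i.i.d. Bernoulli(alpha);
# the LR statistic compares alpha with the observed exceedance rate.
kupiec_pvalue <- function(n_exceed, n_obs, alpha) {
  pi_hat <- n_exceed / n_obs
  lr <- -2 * (n_exceed * log(alpha) + (n_obs - n_exceed) * log(1 - alpha)
              - n_exceed * log(pi_hat) - (n_obs - n_exceed) * log(1 - pi_hat))
  pchisq(lr, df = 1, lower.tail = FALSE)
}
kupiec_pvalue(n_exceed = 15, n_obs = 300, alpha = 0.05)  # exact coverage -> p-value 1
```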
Now the traditional backtests are performed, first for the time frame 2016-19.
get_traditional_backtests_uncond(
uncond_risk_roll_16_19_g50_k25_p250,
alphas = c(0.01, 0.025, 0.05)) %>%
bind_rows(
get_traditional_backtests_uncond(
uncond_risk_roll_16_19_g50_k50_p250,
alphas = c(0.01, 0.025, 0.05)
)
) %>%
bind_rows(
get_traditional_backtests_uncond(
uncond_risk_roll_16_19_g50_k25_p500,
alphas = c(0.01, 0.025, 0.05)
)
) %>%
bind_rows(
get_traditional_backtests_uncond(
uncond_risk_roll_16_19_g50_k50_p500,
alphas = c(0.01, 0.025, 0.05)
)
) %>%
mutate("vine training length" = rep(c(250, 500), each = 16),
"vine window size" = rep(rep(c(25, 50), each = 8), 2))
All p-values are above the common significance level of 5%, such that the traditional backtests all pass. In some cases, i.e. at the alpha level 0.01, -1 appears as a replacement value where the backtest could not be performed due to limitations of the esback package. The closest decisions were the unconditional coverage test at the alpha level 2.5% with vine training length 250 and window size 50, and the one-sided exceedance residual test at the level 5% with vine training length 500 and window size 50. Now the results of the traditional backtests for the second time frame are displayed.
get_traditional_backtests_uncond(
uncond_risk_roll_20_21_g50_k25_p100,
alphas = c(0.01, 0.025, 0.05)) %>%
bind_rows(
get_traditional_backtests_uncond(
uncond_risk_roll_20_21_g50_k25_p200,
alphas = c(0.01, 0.025, 0.05)
)
) %>%
bind_rows(
get_traditional_backtests_uncond(
uncond_risk_roll_20_21_g50_k50_p100,
alphas = c(0.01, 0.025, 0.05)
)
) %>%
bind_rows(
get_traditional_backtests_uncond(
uncond_risk_roll_20_21_g50_k50_p200,
alphas = c(0.01, 0.025, 0.05)
)
) %>%
mutate("vine training length" = rep(c(100, 200), each = 16),
"vine window size" = rep(rep(c(25, 50), each = 8), 2))
The results again look good, apart from some too conservative estimates at the lowest alpha level, where the one-sided test is highly significant while the two-sided one is not. This means that the estimation is probably too conservative in this case. Depending on the user of the backtest this might even be desirable.
A visual comparison of the VaR and the ES for one risk roll specification of the first time frame follows.
risk_estimates(uncond_risk_roll_16_19_g50_k25_p250,
risk_measures = c("ES_mean", "VaR"),
alpha = 0.01) %>%
left_join(msci_spain_16_19 %>%
mutate(row_num = 1:nrow(msci_spain_16_19)),
by = "row_num") %>%
ggplot() +
geom_line(aes(x = date, y = realized), col = "lightgrey") +
geom_line(aes(x = date, y = risk_est,
col = risk_measure, linetype = risk_measure),
size = 0.5) +
labs(x = "estimation window", y = "portfolio log returns",
linetype = "Risk measure",
title = "Comparison of risk measure behavior for alpha level 0.01") +
scale_color_manual(values = c(custom_colors[1], custom_colors[3]),
name = "Risk measure")
As always now the second time frame.
risk_estimates(uncond_risk_roll_20_21_g50_k25_p100,
risk_measures = c("ES_mean", "VaR"),
alpha = 0.01) %>%
left_join(msci_spain_20_21 %>%
mutate(row_num = 1:nrow(msci_spain_20_21)),
by = "row_num") %>%
ggplot() +
geom_line(aes(x = date, y = realized), col = "lightgrey") +
geom_line(aes(x = date, y = risk_est,
col = risk_measure, linetype = risk_measure),
size = 0.5) +
labs(x = "estimation window (2021)", y = "portfolio log returns",
linetype = "Risk measure",
title = "Comparison of risk measure behavior for alpha level 0.01") +
scale_color_manual(values = c(custom_colors[1], custom_colors[3]),
name = "Risk measure")
These plots also empirically support the assumption of collinearity of the ES and the VaR that is used in the ESR backtesting.
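A quick side computation supporting the collinearity remark: for mean-zero Gaussian returns the ES is a fixed multiple of the VaR, independent of the volatility, which is exactly the structural relationship the ESR backtest exploits.

```r
# The ES/VaR ratio for a mean-zero normal distribution is constant in sigma.
alpha <- 0.01
sigmas <- c(0.005, 0.01, 0.02)
var_gauss <- sigmas * qnorm(alpha)                 # lower-tail VaR
es_gauss <- -sigmas * dnorm(qnorm(alpha)) / alpha  # lower-tail ES
unique(round(es_gauss / var_gauss, 6))             # a single constant ratio
```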
Now some exemplary exceedance plots.
# first time frame
risk_estimates(uncond_risk_roll_16_19_g50_k25_p250,
risk_measures = c("VaR"),
alpha = 0.01,
exceeded = TRUE) %>%
left_join(msci_spain_16_19 %>%
mutate(row_num = 1:nrow(msci_spain_16_19)),
by = "row_num") %>%
ggplot() +
geom_line(aes(x = date, y = realized), col = "lightgrey") +
geom_line(aes(x = date, y = risk_est), col = custom_colors[1]) +
geom_point(aes(x = date, y = realized), col = custom_colors[2],
inherit.aes = FALSE, data = . %>% filter(exceeded)) +
labs(x = "estimation window",
y = "portfolio log returns",
col = "Exceeded",
subtitle = paste0("Exceedances are highlighted in ",
"<span style='color:",
custom_colors[2],
"'>**red**</span>",
"."),
title = "Risk measure: VaR with alpha level 0.01"
) +
theme(legend.position = "none")
# second time frame
risk_estimates(uncond_risk_roll_20_21_g50_k25_p100,
risk_measures = c("VaR"),
alpha = 0.01,
exceeded = TRUE) %>%
left_join(msci_spain_20_21 %>%
mutate(row_num = 1:nrow(msci_spain_20_21)),
by = "row_num") %>%
ggplot() +
geom_line(aes(x = date, y = realized), col = "lightgrey") +
geom_line(aes(x = date, y = risk_est), col = custom_colors[1]) +
geom_point(aes(x = date, y = realized), col = custom_colors[2],
inherit.aes = FALSE, data = . %>% filter(exceeded)) +
labs(x = "estimation window (2021)",
y = "portfolio log returns",
col = "Exceeded",
subtitle = paste0("Exceedances are highlighted in ",
"<span style='color:",
custom_colors[2],
"'>**red**</span>",
"."),
title = "Risk measure: VaR with alpha level 0.01"
) +
theme(legend.position = "none")
For the visualization of the ES it is reasonable not to mark the exceedances of the ES itself but again those of the corresponding VaR, which is more natural given the definition of the ES. An example is given below.
# first time frame
risk_estimates(uncond_risk_roll_16_19_g50_k25_p250,
risk_measures = c("ES_mean", "VaR"),
alpha = 0.01,
exceeded = TRUE) %>%
left_join(msci_spain_16_19 %>%
mutate(row_num = 1:nrow(msci_spain_16_19)),
by = "row_num") %>%
ggplot() +
geom_line(aes(x = date, y = realized), col = "lightgrey",
data = . %>% filter(risk_measure == "VaR")) +
geom_line(aes(x = date, y = risk_est), col = custom_colors[1],
data = . %>% filter(risk_measure == "ES_mean")) +
geom_point(aes(x = date, y = realized), col = custom_colors[2],
inherit.aes = FALSE,
data = . %>% filter(exceeded & risk_measure == "VaR")) +
labs(x = "estimation window",
y = "portfolio log returns",
col = "Exceeded",
subtitle = paste0("Exceedances of the VaR are highlighted in ",
"<span style='color:",
custom_colors[2],
"'>**red**</span>",
"."),
title = "Risk measure: ES with alpha level 0.01"
) +
theme(legend.position = "none")
The traditional backtests do not allow for a model comparison. Thus, in order to compare the models, one uses the comparative backtest by Nolde and Ziegel (2017), which is based on scoring functions.
# The resulting matrix has the following interpretation: if a value is less
# than say nu = 5%, then one would argue with the corresponding confidence of
# 95% that the model corresponding to the row performs at least as well as the
# one corresponding to the column. For values between nu and 1 - nu one cannot
# make a definitive decision, but indications are still visible.
# This three zone approach is the one of Nolde and Ziegel (2017).
comparative_backtesting_matrix <- function(roll_list, alpha) {
  n_rolls <- length(roll_list)
  res <- matrix(rep(NA_real_, n_rolls^2), nrow = n_rolls)
  for (i in seq(n_rolls)) {
    for (j in seq(n_rolls)) {
      if (i != j) {
        res[i, j] <- es_comparative_backtest(
          roll1 = roll_list[[i]],
          roll2 = roll_list[[j]],
          alpha = alpha
        )[2]
      }
    }
  }
  colnames(res) <- names(roll_list)
  rownames(res) <- names(roll_list)
  res
}
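The three-zone reading encoded in the comment above can be made explicit with a small hypothetical helper (names and the default threshold are illustrative):

```r
# Entries below nu favour the row model, entries above 1 - nu favour the
# column model; everything in between is inconclusive.
classify_comparison <- function(p, nu = 0.05) {
  ifelse(p < nu, "row model preferred",
         ifelse(p > 1 - nu, "column model preferred", "inconclusive"))
}
classify_comparison(c(0.02, 0.50, 0.97))
```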
comparative_backtesting_matrix(
list(
k25_p250 = uncond_risk_roll_16_19_g50_k25_p250,
k25_p500 = uncond_risk_roll_16_19_g50_k25_p500,
k50_p250 = uncond_risk_roll_16_19_g50_k50_p250,
k50_p500 = uncond_risk_roll_16_19_g50_k50_p500
),
alpha = 0.01
)
## k25_p250 k25_p500 k50_p250 k50_p500
## k25_p250 NA 0.8127362 0.7170077 0.8455523
## k25_p500 0.1872638 NA 0.2756982 0.5676594
## k50_p250 0.2829923 0.7243018 NA 0.7648079
## k50_p500 0.1544477 0.4323406 0.2351921 NA
Here, for the first time frame, no definite decisions are possible. The results however provide some indications. For example, the longer vine training window size of 500 seems to be slightly better than the shorter one. Moreover, the smaller vine refit window of 25 does not seem to be necessary; the window size of 50 might even be superior in some cases. So overall, based on these indications, one would probably opt for the model with vine training size 500 and refit size 50.
In this setting the risk measures were also estimated with only Gaussian and Student's t components (referred to as a t-vine in the following). Thus a comparison is very interesting.
comparative_backtesting_matrix(
list(
all = uncond_risk_roll_16_19_g50_k50_p250,
gaussian_t = uncond_risk_roll_16_19_t
),
alpha = 0.01
)
## all gaussian_t
## all NA 0.7379413
## gaussian_t 0.2620587 NA
There is a slight tendency towards the t-vine model. One can have a look at the respective risk measures visually.
(
risk_estimates(uncond_risk_roll_16_19_g50_k50_p250,
risk_measures = c("ES_mean", "VaR"),
alpha = 0.01,
exceeded = TRUE) %>%
left_join(msci_spain_16_19 %>%
mutate(row_num = 1:nrow(msci_spain_16_19)),
by = "row_num") %>%
ggplot() +
geom_line(aes(x = date, y = realized), col = "lightgrey",
data = . %>% filter(risk_measure == "VaR")) +
geom_line(aes(x = date, y = risk_est), col = custom_colors[1],
data = . %>% filter(risk_measure == "ES_mean")) +
geom_point(aes(x = date, y = realized), col = custom_colors[2],
inherit.aes = FALSE,
data = . %>% filter(exceeded & risk_measure == "VaR")) +
labs(x = "estimation window",
y = "portfolio log returns",
col = "Exceeded",
subtitle = paste0("Exceedances of the VaR are highlighted in ",
"<span style='color:",
custom_colors[2],
"'>**red**</span>",
"."),
title = "Risk measure: ES with alpha level 0.01, all parametric copula families"
) +
theme(legend.position = "none")
) / (
risk_estimates(uncond_risk_roll_16_19_t,
risk_measures = c("ES_mean", "VaR"),
alpha = 0.01,
exceeded = TRUE) %>%
left_join(msci_spain_16_19 %>%
mutate(row_num = 1:nrow(msci_spain_16_19)),
by = "row_num") %>%
ggplot() +
geom_line(aes(x = date, y = realized), col = "lightgrey",
data = . %>% filter(risk_measure == "VaR")) +
geom_line(aes(x = date, y = risk_est), col = custom_colors[1],
data = . %>% filter(risk_measure == "ES_mean")) +
geom_point(aes(x = date, y = realized), col = custom_colors[2],
inherit.aes = FALSE,
data = . %>% filter(exceeded & risk_measure == "VaR")) +
labs(x = "estimation window",
y = "portfolio log returns",
col = "Exceeded",
subtitle = paste0("Exceedances of the VaR are highlighted in ",
"<span style='color:",
custom_colors[2],
"'>**red**</span>",
"."),
title = "Risk measure: ES with alpha level 0.01, only Gaussian and Student's t copulas"
) +
theme(legend.position = "none")
)
The difference is visually also quite small. One can only see slightly more conservative risk measures for the t-vine model.
Next up the second time frame.
comparative_backtesting_matrix(
list(
k25_p100 = uncond_risk_roll_20_21_g50_k25_p100,
k25_p200 = uncond_risk_roll_20_21_g50_k25_p200,
k50_p100 = uncond_risk_roll_20_21_g50_k50_p100,
k50_p200 = uncond_risk_roll_20_21_g50_k50_p200
),
alpha = 0.01
)
## k25_p100 k25_p200 k50_p100 k50_p200
## k25_p100 NA 0.6759593 0.06856834 0.7091422
## k25_p200 0.3240407 NA 0.19989276 0.7861632
## k50_p100 0.9314317 0.8001072 NA 0.7983376
## k50_p200 0.2908578 0.2138368 0.20166243 NA
Here, as above, a tendency towards the larger vine training window size is visible; no definitive answer can be given, however. There is no clear indication whether the shorter or the longer vine rolling window of 25 or 50 performs better. From these indications one would probably opt for the model with vine training size 200 and refit size 50.
Again for the comparison a t-vine model was fitted.
comparative_backtesting_matrix(
list(
all = uncond_risk_roll_20_21_g50_k50_p200,
gaussian_t = uncond_risk_roll_20_21_t
),
alpha = 0.01
)
## all gaussian_t
## all NA 0.3600806
## gaussian_t 0.6399194 NA
Contrary to the first time frame, here an indication is visible that the general R-vine model performs better. The decision is however not definitive. Once again a visual inspection is reasonable.
(
risk_estimates(uncond_risk_roll_20_21_g50_k50_p200,
risk_measures = c("ES_mean", "VaR"),
alpha = 0.01,
exceeded = TRUE) %>%
left_join(msci_spain_20_21 %>%
mutate(row_num = 1:nrow(msci_spain_20_21)),
by = "row_num") %>%
ggplot() +
geom_line(aes(x = date, y = realized), col = "lightgrey",
data = . %>% filter(risk_measure == "VaR")) +
geom_line(aes(x = date, y = risk_est), col = custom_colors[1],
data = . %>% filter(risk_measure == "ES_mean")) +
geom_point(aes(x = date, y = realized), col = custom_colors[2],
inherit.aes = FALSE,
data = . %>% filter(exceeded & risk_measure == "VaR")) +
labs(x = "estimation window (2021)",
y = "portfolio log returns",
col = "Exceeded",
subtitle = paste0("Exceedances of the VaR are highlighted in ",
"<span style='color:",
custom_colors[2],
"'>**red**</span>",
"."),
title = "Risk measure: ES with alpha level 0.01, all parametric copula families"
) +
theme(legend.position = "none")
) / (
risk_estimates(uncond_risk_roll_20_21_t,
risk_measures = c("ES_mean", "VaR"),
alpha = 0.01,
exceeded = TRUE) %>%
left_join(msci_spain_20_21 %>%
mutate(row_num = 1:nrow(msci_spain_20_21)),
by = "row_num") %>%
ggplot() +
geom_line(aes(x = date, y = realized), col = "lightgrey",
data = . %>% filter(risk_measure == "VaR")) +
geom_line(aes(x = date, y = risk_est), col = custom_colors[1],
data = . %>% filter(risk_measure == "ES_mean")) +
geom_point(aes(x = date, y = realized), col = custom_colors[2],
inherit.aes = FALSE,
data = . %>% filter(exceeded & risk_measure == "VaR")) +
labs(x = "estimation window (2021)",
y = "portfolio log returns",
col = "Exceeded",
subtitle = paste0("Exceedances of the VaR are highlighted in ",
"<span style='color:",
custom_colors[2],
"'>**red**</span>",
"."),
title = "Risk measure: ES with alpha level 0.01, only Gaussian and Student's t copulas"
) +
theme(legend.position = "none")
)
The only considerable difference is that the risk estimates corresponding to the R-vine model are slightly more conservative, which might be desirable in a high-volatility situation like the ongoing pandemic in 2021.
All in all one can conclude that the unconditional risk measure estimation approach worked really well in both time frames, as all traditional backtests passed at all tested confidence levels.
Here the same base portfolio as in the unconditional case is used, but a conditioning variable is introduced. One will have a look at the influence of two conditioning market indices, namely the SP500 and the Eurostoxx50. They represent the overall American and European market trends.
# load the single conditional models
load(here::here("data", "msci_spain_cond1_1619euro.RData"))
load(here::here("data", "msci_spain_cond1_1619sp500.RData"))
load(here::here("data", "msci_spain_cond1_2021.RData"))
Look at the fitted models for the first time frame.
summary(cond_risk_roll_16_19_sp500)
## An object of class <cond_portvine_roll>
##
## --- Conditional settings ---
## Conditional variable(s): sp500
## Number of conditional estimated risk measures: 46720
## Conditioning quantiles: 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
##
## --- Marginal models ---
## Number of ARMA-GARCH/ marginal windows: 6
## Train size: 750
## Refit size: 50
##
## --- Vine copula models ---
## Number of vine windows: 12
## Train size: 250
## Refit size: 25
## Vine copula type: dvine
## Vine family set: parametric
##
## --- Risk estimation ---
## Risk measures estimated: VaR ES_mean ES_mc ES_median
## Alpha levels used: 0.01 0.025 0.05 0.95
## Number of estimated risk measures: 4672
## Number of samples for each risk estimation: 1e+05
##
## Time taken: 395.0241 minutes.
summary(cond_risk_roll_16_19_eurostoxx50)
## An object of class <cond_portvine_roll>
##
## --- Conditional settings ---
## Conditional variable(s): eurostoxx50
## Number of conditional estimated risk measures: 46720
## Conditioning quantiles: 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
##
## --- Marginal models ---
## Number of ARMA-GARCH/ marginal windows: 6
## Train size: 750
## Refit size: 50
##
## --- Vine copula models ---
## Number of vine windows: 12
## Train size: 250
## Refit size: 25
## Vine copula type: dvine
## Vine family set: parametric
##
## --- Risk estimation ---
## Risk measures estimated: VaR ES_mean ES_mc ES_median
## Alpha levels used: 0.01 0.025 0.05 0.95
## Number of estimated risk measures: 4672
## Number of samples for each risk estimation: 1e+05
##
## Time taken: 460.3486 minutes.
And the second time frame.
summary(cond_risk_roll_20_21_sp500)
## An object of class <cond_portvine_roll>
##
## --- Conditional settings ---
## Conditional variable(s): sp500
## Number of conditional estimated risk measures: 16000
## Conditioning quantiles: 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
##
## --- Marginal models ---
## Number of ARMA-GARCH/ marginal windows: 2
## Train size: 300
## Refit size: 50
##
## --- Vine copula models ---
## Number of vine windows: 4
## Train size: 200
## Refit size: 25
## Vine copula type: dvine
## Vine family set: parametric
##
## --- Risk estimation ---
## Risk measures estimated: VaR ES_mean ES_mc ES_median
## Alpha levels used: 0.01 0.025 0.05 0.95
## Number of estimated risk measures: 1600
## Number of samples for each risk estimation: 1e+05
##
## Time taken: 86.0664 minutes.
summary(cond_risk_roll_20_21_eurostoxx50)
## An object of class <cond_portvine_roll>
##
## --- Conditional settings ---
## Conditional variable(s): eurostoxx50
## Number of conditional estimated risk measures: 16000
## Conditioning quantiles: 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
##
## --- Marginal models ---
## Number of ARMA-GARCH/ marginal windows: 2
## Train size: 300
## Refit size: 50
##
## --- Vine copula models ---
## Number of vine windows: 4
## Train size: 200
## Refit size: 25
## Vine copula type: dvine
## Vine family set: parametric
##
## --- Risk estimation ---
## Risk measures estimated: VaR ES_mean ES_mc ES_median
## Alpha levels used: 0.01 0.025 0.05 0.95
## Number of estimated risk measures: 1600
## Number of samples for each risk estimation: 1e+05
##
## Time taken: 97.0685 minutes.
Here one actually only has to check whether the index is fitted well; the marginal models for the portfolio assets do not change.
marg_viz_cond_risk_roll_16_19sp500 <- marg_resid_viz_list(
cond_risk_roll_16_19_sp500
)
marg_viz2_cond_risk_roll_16_19sp500 <- marg_resid_viz_list(
cond_risk_roll_16_19_sp500, squared = TRUE
)
marg_viz_cond_risk_roll_16_19euro <- marg_resid_viz_list(
cond_risk_roll_16_19_eurostoxx50
)
marg_viz2_cond_risk_roll_16_19euro <- marg_resid_viz_list(
cond_risk_roll_16_19_eurostoxx50, squared = TRUE
)
marg_viz_cond_risk_roll_20_21sp500 <- marg_resid_viz_list(
cond_risk_roll_20_21_sp500
)
marg_viz2_cond_risk_roll_20_21sp500 <- marg_resid_viz_list(
cond_risk_roll_20_21_sp500, squared = TRUE
)
marg_viz_cond_risk_roll_20_21euro <- marg_resid_viz_list(
cond_risk_roll_20_21_eurostoxx50
)
marg_viz2_cond_risk_roll_20_21euro <- marg_resid_viz_list(
cond_risk_roll_20_21_eurostoxx50, squared = TRUE
)
The residual analysis for the conditional variables in the first time frame suggests a good fit of the ARMA(1,1)-GARCH(1,1) model, as evident below.
marg_viz_cond_risk_roll_16_19sp500$sp500
marg_viz2_cond_risk_roll_16_19sp500$sp500
marg_viz_cond_risk_roll_16_19euro$eurostoxx50
marg_viz2_cond_risk_roll_16_19euro$eurostoxx50
For the second time frame one can observe a very close call for the mean equation of the marginal model of the conditional variable SP500. As the decision is very close, the model will not be adjusted in this case.
marg_viz_cond_risk_roll_20_21sp500$sp500
marg_viz2_cond_risk_roll_20_21sp500$sp500
marg_viz_cond_risk_roll_20_21euro$eurostoxx50
marg_viz2_cond_risk_roll_20_21euro$eurostoxx50
Here one uses D-vines with the index fixed at the rightmost leaf position. One can have a look at whether the ordering changes over time and which asset seems to be most influenced by the market index. Again the fitted bivariate copulas are also of interest.
One starts with the first time frame and the SP500 index.
labeled_vinecop_plot(fitted_vines(cond_risk_roll_16_19_sp500)[[1]])
labeled_vinecop_plot(fitted_vines(cond_risk_roll_16_19_sp500)[[6]])
labeled_vinecop_plot(fitted_vines(cond_risk_roll_16_19_sp500)[[12]])
bicops_used(fitted_vines(cond_risk_roll_16_19_sp500)[[1]])
##
## bb1 bb8 clayton frank gaussian gumbel indep joe
## 1 1 2 11 4 6 13 4
## t
## 3
bicops_used(fitted_vines(cond_risk_roll_16_19_sp500)[[6]])
##
## bb1 bb8 clayton frank gaussian gumbel indep joe
## 1 2 6 8 3 7 15 1
## t
## 2
bicops_used(fitted_vines(cond_risk_roll_16_19_sp500)[[12]])
##
## bb7 bb8 clayton frank gaussian gumbel indep joe
## 1 2 6 6 4 4 15 4
## t
## 3
The ordering changes quite strongly, and notably the pairwise dependence, in terms of Kendall's tau, between the index and the second rightmost asset is really weak. Next up, one will have a look at the Eurostoxx 50 index.
labeled_vinecop_plot(fitted_vines(cond_risk_roll_16_19_eurostoxx50)[[1]])
labeled_vinecop_plot(fitted_vines(cond_risk_roll_16_19_eurostoxx50)[[6]])
labeled_vinecop_plot(fitted_vines(cond_risk_roll_16_19_eurostoxx50)[[12]])
bicops_used(fitted_vines(cond_risk_roll_16_19_eurostoxx50)[[1]])
##
## bb1 bb7 bb8 clayton frank gaussian gumbel indep
## 2 1 1 4 11 8 5 5
## joe t
## 4 4
bicops_used(fitted_vines(cond_risk_roll_16_19_eurostoxx50)[[6]])
##
## bb1 bb7 bb8 clayton frank gaussian gumbel indep
## 1 1 1 2 15 4 10 8
## t
## 3
bicops_used(fitted_vines(cond_risk_roll_16_19_eurostoxx50)[[12]])
##
## bb1 bb8 clayton frank gaussian gumbel indep joe
## 1 1 3 7 4 9 10 6
## t
## 4
Notably, and as expected, the pairwise dependence of the index with the second rightmost asset (Banco Santander) is much stronger. The stronger dependence is also anticipated because, for example, Banco Santander, the Amadeus IT Group, Iberdrola and Inditex are included in the index, albeit with a cumulative weight of only roughly 5%. Thus, in a stress-testing situation one would expect conditioning on the Eurostoxx 50 index to be much more influential. The same analysis is now performed for the second time frame.
labeled_vinecop_plot(fitted_vines(cond_risk_roll_20_21_sp500)[[1]])
labeled_vinecop_plot(fitted_vines(cond_risk_roll_20_21_sp500)[[4]])
bicops_used(fitted_vines(cond_risk_roll_20_21_sp500)[[1]])
##
## bb1 bb7 bb8 clayton frank gaussian gumbel indep
## 3 3 1 5 6 2 2 12
## joe t
## 7 4
bicops_used(fitted_vines(cond_risk_roll_20_21_sp500)[[4]])
##
## bb1 bb8 clayton frank gaussian gumbel indep joe
## 2 1 2 7 6 7 13 3
## t
## 4
As in the unconditional case one can detect more extreme value copulas, and notably Iberdrola and Inditex are both quite stable in the ordering. Now to the Eurostoxx 50 index.
labeled_vinecop_plot(fitted_vines(cond_risk_roll_20_21_eurostoxx50)[[1]])
labeled_vinecop_plot(fitted_vines(cond_risk_roll_20_21_eurostoxx50)[[4]])
bicops_used(fitted_vines(cond_risk_roll_20_21_eurostoxx50)[[1]])
##
## bb1 bb7 clayton frank gaussian gumbel indep joe
## 1 5 2 11 5 6 6 5
## t
## 4
bicops_used(fitted_vines(cond_risk_roll_20_21_eurostoxx50)[[4]])
##
## bb1 bb7 bb8 clayton frank gaussian gumbel indep
## 2 2 2 7 6 5 8 7
## joe t
## 4 2
The heavy usage of extreme value copulas, especially in the first tree level, is directly evident. Now Inditex and the Amadeus IT Group, both constituents of the index, show strong dependencies with the index.
As the risk measure estimation is now conditioned each day on a conditioning value of the conditioning variable, it is reasonable to first look at the different conditioning values that were estimated. Here one differentiates between the conditioning values based on quantiles and the ones based on the residual of the prior day (prior_resid). One starts again with the first time frame and the conditioning on the SP500.
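For intuition on the return scale, the quantile levels can be related to empirical quantiles of the conditioning series. A minimal sketch on simulated placeholder returns (the actual series lives in msci_spain_16_19):

```r
# empirical quantiles of a placeholder return series illustrate what
# conditioning on e.g. the 0.1 or 0.9 quantile level roughly corresponds
# to on the log-return scale
set.seed(42)
placeholder_returns <- rnorm(1000, mean = 0, sd = 0.01)
round(quantile(placeholder_returns, probs = seq(0.1, 0.9, 0.1)), 4)
```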
# compare the estimated conditioning values per quantile level with the
# observed log returns of the SP500 over the estimation windows
risk_estimates(cond_risk_roll_16_19_sp500, risk_measures = "ES_mean",
               cond_u = seq(0.1, 0.9, 0.1),
               alpha = 0.05, exceeded = TRUE) %>%
  # attach the observed returns via the row number of the estimation window
  left_join(msci_spain_16_19 %>%
              mutate(row_num = 1:nrow(msci_spain_16_19)),
            by = "row_num") %>%
  ggplot() +
  # observed SP500 log returns (join suffix .y)
  geom_line(
    aes(x = date, y = sp500.y), col = "black", size = .6, alpha = 0.5) +
  # estimated conditioning values for each quantile level (join suffix .x)
  geom_line(aes(x = date, y = sp500.x, col = fct_rev(factor(cond_u))),
            size = 0.5) +
  labs(title = "Conditional variable: SP500", col = "Quantile level",
       x = "estimation window", y = "log returns of the SP500")
This is exactly the anticipated behavior. Using a small quantile level \(\alpha^I\) like 0.2 resembles a really bad situation in the American market. The conditioning values based on the prior day's residual closely mimic/forecast the true behavior of the conditional asset, as can be seen below. A slight delay and some exaggerations are clearly visible.
risk_estimates(cond_risk_roll_16_19_sp500, risk_measures = "ES_mean",
cond_u = "prior_resid",
alpha = 0.05, exceeded = TRUE) %>%
left_join(msci_spain_16_19 %>%
mutate(row_num = 1:nrow(msci_spain_16_19)),
by = "row_num") %>%
ggplot() +
geom_line(
aes(x = date, y = sp500.y), col = "black", size = .6, alpha = 0.5) +
geom_line(aes(x = date, y = sp500.x),
size = 0.5, col = custom_colors[4]) +
labs(title = "Conditional variable: SP500", col = "Quantile level",
x = "estimation window", y = "log returns of the SP500",
subtitle = paste0("The conditional values corresponding to the residual
of the prior day are drawn in ",
"<span style='color:",
custom_colors[4],
"'>**violet**</span>",
".")) +
theme_light() +
ggforce::facet_zoom(x = date >= as.Date("2019-03-01") & date <= as.Date("2019-04-01")) +
theme(panel.border = element_blank(),
plot.subtitle = ggtext::element_markdown(size = 9))
Now the Eurostoxx50 index for the first time frame.
risk_estimates(cond_risk_roll_16_19_eurostoxx50, risk_measures = "ES_mean",
cond_u = seq(0.1, 0.9, 0.1),
alpha = 0.05, exceeded = TRUE) %>%
left_join(msci_spain_16_19 %>%
mutate(row_num = 1:nrow(msci_spain_16_19)),
by = "row_num") %>%
ggplot() +
geom_line(
aes(x = date, y = eurostoxx50.y), col = "black", size = .6, alpha = 0.5) +
geom_line(aes(x = date, y = eurostoxx50.x, col = fct_rev(factor(cond_u))),
size = 0.5) +
labs(title = "Conditional variable: Eurostoxx 50", col = "Quantile level",
x = "estimation window", y = "log returns of the Eurostoxx 50")
The same pattern as for the other conditional asset is visible. The Eurostoxx 50 index, however, seems to experience more extreme market situations.
risk_estimates(cond_risk_roll_16_19_eurostoxx50, risk_measures = "ES_mean",
cond_u = "prior_resid",
alpha = 0.05, exceeded = TRUE) %>%
left_join(msci_spain_16_19 %>%
mutate(row_num = 1:nrow(msci_spain_16_19)),
by = "row_num") %>%
ggplot() +
geom_line(
aes(x = date, y = eurostoxx50.y), col = "black", size = .6, alpha = 0.5) +
geom_line(aes(x = date, y = eurostoxx50.x),
size = 0.5, col = custom_colors[4]) +
labs(title = "Conditional variable: Eurostoxx 50", col = "Quantile level",
x = "estimation window", y = "log returns of the Eurostoxx 50",
subtitle = paste0("The conditional values corresponding to the residual
of the prior day are drawn in ",
"<span style='color:",
custom_colors[4],
"'>**violet**</span>",
".")) +
theme_light() +
ggforce::facet_zoom(x = date >= as.Date("2019-03-01") & date <= as.Date("2019-04-01")) +
theme(panel.border = element_blank(),
plot.subtitle = ggtext::element_markdown(size = 9))
Again, the same pattern as for the SP500 is visible. These visualizations are now presented for the second time frame as well.
risk_estimates(cond_risk_roll_20_21_sp500, risk_measures = "ES_mean",
cond_u = seq(0.1, 0.9, 0.1),
alpha = 0.05, exceeded = TRUE) %>%
left_join(msci_spain_20_21 %>%
mutate(row_num = 1:nrow(msci_spain_20_21)),
by = "row_num") %>%
ggplot() +
geom_line(
aes(x = date, y = sp500.y), col = "black", size = .6, alpha = 0.5) +
geom_line(aes(x = date, y = sp500.x, col = fct_rev(factor(cond_u))),
size = 0.5) +
labs(title = "Conditional variable: SP500", col = "Quantile level",
x = "estimation window (2021)", y = "log returns of the SP500")
risk_estimates(cond_risk_roll_20_21_sp500, risk_measures = "ES_mean",
cond_u = "prior_resid",
alpha = 0.05, exceeded = TRUE) %>%
left_join(msci_spain_20_21 %>%
mutate(row_num = 1:nrow(msci_spain_20_21)),
by = "row_num") %>%
ggplot() +
geom_line(
aes(x = date, y = sp500.y), col = "black", size = .6, alpha = 0.3) +
geom_line(aes(x = date, y = sp500.x),
size = 0.5, col = custom_colors[4]) +
labs(title = "Conditional variable: SP500", col = "Quantile level",
x = "estimation window (2021)", y = "log returns of the SP500",
subtitle = paste0("The conditional values corresponding to the residual
of the prior day are drawn in ",
"<span style='color:",
custom_colors[4],
"'>**violet**</span>",
"."))
The slight delay of the conditional values is clearly visible.
risk_estimates(cond_risk_roll_20_21_eurostoxx50, risk_measures = "ES_mean",
cond_u = seq(0.1, 0.9, 0.1),
alpha = 0.05, exceeded = TRUE) %>%
left_join(msci_spain_20_21 %>%
mutate(row_num = 1:nrow(msci_spain_20_21)),
by = "row_num") %>%
ggplot() +
geom_line(
aes(x = date, y = eurostoxx50.y), col = "black", size = .6, alpha = 0.5) +
geom_line(aes(x = date, y = eurostoxx50.x, col = fct_rev(factor(cond_u))),
size = 0.5) +
labs(title = "Conditional variable: Eurostoxx 50", col = "Quantile level",
x = "estimation window (2021)", y = "log returns of the Eurostoxx 50")
risk_estimates(cond_risk_roll_20_21_eurostoxx50, risk_measures = "ES_mean",
cond_u = "prior_resid",
alpha = 0.05, exceeded = TRUE) %>%
left_join(msci_spain_20_21 %>%
mutate(row_num = 1:nrow(msci_spain_20_21)),
by = "row_num") %>%
ggplot() +
geom_line(
aes(x = date, y = eurostoxx50.y), col = "black", size = .6, alpha = 0.4) +
geom_line(aes(x = date, y = eurostoxx50.x),
size = 0.5, col = custom_colors[4]) +
labs(title = "Conditional variable: Eurostoxx 50", col = "Quantile level",
x = "estimation window (2021)", y = "log returns of the Eurostoxx 50",
subtitle = paste0("The conditional values corresponding to the residual
of the prior day are drawn in ",
"<span style='color:",
custom_colors[4],
"'>**violet**</span>",
"."))
Again the same patterns as before are visible.
To conclude, one could visualize the corresponding risk estimates, have a look at which quantile levels they fail the traditional backtests, backtest the prior-residual risk measure in the traditional way and possibly compare it with the unconditional one.
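For the traditional backtest alluded to here, a minimal unconditional-coverage sketch in base R (Kupiec-style exceedance counting on simulated data, not on portvine output) could look as follows:

```r
# count VaR exceedances and test whether their frequency is consistent
# with the nominal level alpha (simple unconditional coverage check)
set.seed(42)
alpha <- 0.05
n_days <- 250
returns <- rnorm(n_days, sd = 0.01)       # placeholder portfolio returns
var_forecast <- qnorm(alpha, sd = 0.01)   # placeholder constant VaR forecast
n_exceed <- sum(returns < var_forecast)
# exact binomial test of H0: P(exceedance) = alpha
binom.test(n_exceed, n = n_days, p = alpha)
```

A small p-value would indicate that the observed exceedance frequency is inconsistent with the nominal level, i.e. that the risk measure fails this simple backtest.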